Logistic PCA
word2vec Skip-Gram with Negative Sampling is a Weighted Logistic PCA
Landgraf, Andrew J., Bellay, Jeremy
Mikolov et al. (2013) introduced the skip-gram formulation for neural word embeddings, in which one tries to predict the context words of a given word. Their negative-sampling algorithm improved the computational feasibility of training the embeddings. Due to the state-of-the-art performance of these embeddings on a number of tasks, there has been much research aimed at better understanding them. Goldberg and Levy (2014) showed that the skip-gram with negative-sampling algorithm (SGNS) maximizes a different likelihood than the one the skip-gram formulation poses, and they further showed how it is implicitly related to pointwise mutual information (Levy and Goldberg, 2014). We show that SGNS is a weighted logistic PCA, a special case of exponential family PCA with a binomial likelihood. Cotterell et al. (2017) showed that the skip-gram formulation can be viewed as exponential family PCA with a multinomial likelihood, but they did not make the connection between the negative-sampling algorithm and the binomial likelihood. Li et al. (2015) showed that SGNS is an explicit matrix factorization related to representation learning, but the factorization objective they derived is complicated, and they did not connect it to the binomial distribution or exponential family PCA.
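The weighted logistic PCA view of SGNS can be sketched numerically: given counts of positive (word, context) pairs and expected negative-sample counts, fit a low-rank matrix of natural parameters by gradient ascent on the weighted binomial log-likelihood. This is an illustrative sketch under assumed inputs, not the authors' implementation; the function name and the fixed negative-count matrix are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_as_weighted_logistic_pca(pos_counts, neg_counts, rank=2,
                                  lr=0.05, n_iter=500, seed=0):
    """Sketch: fit low-rank natural parameters X = W @ C.T by gradient
    ascent on the weighted binomial log-likelihood
        sum_ij [ n_ij * log sigma(x_ij) + m_ij * log sigma(-x_ij) ],
    where n_ij are positive (word, context) counts and m_ij the expected
    negative-sample counts (the SGNS objective in matrix form)."""
    rng = np.random.default_rng(seed)
    n_words, n_contexts = pos_counts.shape
    W = 0.1 * rng.standard_normal((n_words, rank))      # word vectors
    C = 0.1 * rng.standard_normal((n_contexts, rank))   # context vectors
    for _ in range(n_iter):
        P = sigmoid(W @ C.T)
        # Gradient of the weighted log-likelihood w.r.t. X = W C^T
        G = pos_counts * (1.0 - P) - neg_counts * P
        W_new = W + lr * (G @ C)
        C = C + lr * (G.T @ W)
        W = W_new
    return W, C
```

With rank equal to the embedding dimension, the learned W plays the role of the word embeddings and C of the context embeddings; the binomial weights n_ij + m_ij are what make this a *weighted* logistic PCA rather than the unweighted Bernoulli case.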
Dimensionality Reduction for Binary Data through the Projection of Natural Parameters
Landgraf, Andrew J., Lee, Yoonkyung
Principal component analysis (PCA) of binary data, known as logistic PCA, has become a popular alternative to applying ordinary PCA for dimensionality reduction of binary data. It is motivated as an extension of ordinary PCA by means of a matrix factorization, akin to the singular value decomposition, that maximizes the Bernoulli log-likelihood. We propose a new formulation of logistic PCA which extends Pearson's formulation of a low-dimensional data representation with minimum error to binary data. Our formulation does not require a matrix factorization, as previous methods do, but instead looks for projections of the natural parameters from the saturated model. Because of this difference, the number of parameters does not grow with the number of observations, and the principal component scores on new data can be computed with simple matrix multiplication. We derive explicit solutions for data matrices of special structure and provide computationally efficient algorithms for solving for the principal component loadings. Through simulation experiments and an analysis of medical diagnosis data, we compare our formulation of logistic PCA to the previous formulation, as well as to ordinary PCA, to demonstrate its benefits.
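The projection idea can be sketched in a few lines: map the binary data to natural parameters of the saturated model (bounded at ±m for some chosen m), then project the centered parameters onto a rank-k loading matrix. The sketch below uses the top-k right singular vectors as the loadings, which serves as an initialization; the full method refines the loadings by optimizing the Bernoulli deviance (e.g., with an MM algorithm), which is omitted here. Function names are illustrative.

```python
import numpy as np

def logistic_pca_projection(X, k=2, m=4.0):
    """Sketch of logistic PCA via projection of natural parameters.
    The saturated model's natural parameters are taken as m*(2*X - 1);
    they are centered, and the top-k right singular vectors give an
    orthonormal d x k loading matrix U (initialization only; the full
    method refines U to minimize the Bernoulli deviance)."""
    theta = m * (2.0 * X - 1.0)      # natural parameters, saturated model
    mu = theta.mean(axis=0)          # main effects
    _, _, Vt = np.linalg.svd(theta - mu, full_matrices=False)
    U = Vt[:k].T                     # d x k orthonormal loadings
    return mu, U

def pc_scores(X_new, mu, U, m=4.0):
    """Scores for new data: a single matrix multiplication, with no
    per-observation parameters to re-fit."""
    theta_new = m * (2.0 * X_new - 1.0)
    return (theta_new - mu) @ U
```

Note how the stated benefit appears directly in the code: because the model is defined by (mu, U) alone, scoring new observations is just centering followed by one matrix product, with no optimization over per-row scores.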